Abstract

Classically, communication systems are designed assum- 
ing perfect channel state information at the receiver and/or 
transmitter. However, in many practical situations, only an 
estimate of the channel is available that differs from the true 
channel. We address this channel mismatch scenario by in- 
troducing the notion of estimation-induced outage capac- 
ity, for which we provide an associated coding theorem and 
its strong converse, assuming a discrete memoryless channel.
The transmitter and receiver strive to construct codes that
ensure reliable communication with a quality of service (QoS),
in terms of achieving a target rate with small error
probability, whatever degree of channel estimation accuracy
arises during a transmission. We illustrate
our ideas via numerical simulations for transmissions over 
Ricean fading channels using rate-limited feedback channel 
and maximum likelihood (ML) channel estimation. Our re- 
sults provide intuitive insights on the impact of the channel 
estimate and the channel characteristics (SNR, Ricean K- 
factor, training sequence length, feedback rate, etc.) on the 
mean outage capacity. 

1. Introduction 

Channel uncertainty, caused e.g. by time varia- 
tions/fading, interference, or channel estimation errors, can 
severely impair the performance of wireless systems. Even 
if the channel is quasi-static and interference is small, uncer- 
tainty induced by imperfect channel state information (CSI) 
remains. This motivates us to study the design of communication
systems that must ensure information transmission at a target
rate while satisfying a quality of service (QoS) constraint,
i.e. reliable communication, whatever degree of channel
estimation accuracy arises during the communication.

We first review the model for communication under channel
uncertainty over a discrete memoryless channel (DMC) with finite
input alphabet $\mathscr{X}$ and output alphabet $\mathscr{Y}$
[1]. A specific instance of the unknown channel is characterized
by a transition probability mass (PM)
$W(\cdot|x,\theta) \in \mathcal{W}_\Theta$ with an unknown channel
state $\theta \in \Theta \subset \mathbb{C}^d$. Here,
$\mathcal{W}_\Theta = \{W(\cdot|x,\theta) : x \in \mathscr{X}, \theta \in \Theta\}$
is a family of conditional transition PMs on $\mathscr{Y}$,
parameterized by a vector $\theta \in \Theta$. Channel uncertainty
can be addressed by considering a composite channel model, through
a notion of reliable communication based on the average of the
error probability over all channel estimation errors. Capacity
bounds for additive white Gaussian noise (AWGN) channels with MMSE
channel estimation, i.e. imperfect CSI at the receiver (CSIR), and
no CSI at the transmitter (CSIT) were derived in [2].

Throughout the paper we assume that the channel state, which
neither the transmitter nor the receiver knows exactly, is
constant within blocks of duration $T$ symbol periods (coherence
time), and channel states in different blocks are i.i.d.
$\theta \sim \psi(\theta)$. The extension of the DMC
$W(\cdot|x,\theta)$ to $n$ channel uses within a block is given by
$W^n(\mathbf{y}|\mathbf{x},\theta) = \prod_{i=1}^{n} W(y_i|x_i,\theta)$
where $\mathbf{x} = (x_1,\dots,x_n)$ and
$\mathbf{y} = (y_1,\dots,y_n)$. The receiver only knows an
estimate $\hat\theta_R$ of the channel state and a
characterization of the estimator performance in terms of the
conditional probability density function (pdf)
$\psi(\theta|\hat\theta_R)$ (this can be obtained using
$\mathcal{W}_\Theta$, the estimator function and the a priori
distribution of $\theta$). Moreover, a feedback channel provides
the transmitter with noisy CSI $\hat\theta_T$ ($\hat\theta_T$ in
general is different from $\hat\theta_R$, e.g. due to
quantization). The joint distribution of
$(\hat\theta_T,\hat\theta_R,\theta)$ is given by
$\psi(\hat\theta_T,\hat\theta_R,\theta)$. The scenario underlying
these assumptions is motivated by current wireless systems, where
for the case of a mobile receiver $T$ may be too short to permit
reliable estimation of the fading coefficients.
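The block extension of the channel can be illustrated with a minimal sketch. It assumes, purely for illustration, a binary symmetric channel whose crossover probability plays the role of the state $\theta$; the function names `bsc` and `block_likelihood` are hypothetical and not part of the paper.

```python
from math import prod


def bsc(y, x, theta):
    """Single-use toy channel W(y|x, theta): a BSC with crossover probability theta."""
    return theta if y != x else 1.0 - theta


def block_likelihood(y_seq, x_seq, theta, channel=bsc):
    """W^n(y|x, theta) = prod_i W(y_i|x_i, theta), with theta fixed over the block."""
    if len(y_seq) != len(x_seq):
        raise ValueError("x and y must have the same block length n")
    return prod(channel(y, x, theta) for y, x in zip(y_seq, x_seq))
```

For a block of length three with one flipped symbol and $\theta = 0.1$, the product is $0.9 \cdot 0.9 \cdot 0.1$.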

The concept of outage capacity was first proposed in [3] for
fading channels. It is defined as the maximum rate that can be
supported with probability $1-\gamma$, where $\gamma$ is a
prescribed outage probability. In contrast, ergodic capacity is
the maximum information rate for which error probability decays
exponentially with the code length.

In our setting, a transceiver using
$\hat\theta = (\hat\theta_R,\hat\theta_T)$ instead of $\theta$
obviously might not support an information rate $R$ even if $R$ is
less than the channel's capacity under perfect CSIR (even
arbitrarily small rates might not be supported if $\theta$ and
$\hat\theta$ happen to be strongly different). Consequently,
outages induced by channel estimation errors will occur with a
certain probability $\gamma$. The outage probability depends on
the codeword error probability, averaged over a random coding
ensemble and over all channel realizations given the estimated
state. We first formalize the notion of estimation-induced outage
capacity for general DMCs, and then we present a coding theorem
providing the explicit expression for the corresponding capacity,
which is a function of the outage probability $\gamma$ (Section 2).
Due to the independence of different blocks (coherence intervals),
it is sufficient to study the estimation-induced outage rate
$C(\gamma,\psi_{\theta|\hat\theta},\hat\theta)$ for a single block
(coherence interval), for which the channel state is fixed but
unknown to the transmitter and the receiver. Since this rate still
depends on the random channel estimates $\hat\theta$, we will
consider the performance measure



$\bar{C}(\gamma,\psi_{\theta|\hat\theta}) = E_{\hat\theta}\{C(\gamma,\psi_{\theta|\hat\theta},\hat\theta)\}$,   (1)

which describes the (average) information rate with prescribed
outage probability. The expectation in (1) is with respect to the
joint distribution
$\psi(\hat\theta_T|\hat\theta_R) \int_\Theta \psi(\hat\theta_R|\theta)\psi(\theta)\,d\theta$
and reflects an average over a large number of blocks (coherence
intervals), cf. the discussion in [4].

Our notion of reliable communication is relevant e.g. for
communication systems where a quality of service (QoS) in terms of
error performance must be ensured although significant channel
variations occur due to user mobility. An example of such a
scenario, involving a fading Ricean channel with AWGN,
rate-limited feedback, and maximum likelihood (ML) channel
estimation, will be considered in Section 4 to illustrate the mean
outage capacity $\bar{C}(\gamma,\psi_{\theta|\hat\theta})$.

2. Problem Statement and Main Result 

In this section, we first formalize the notion of
estimation-induced outage capacity and then state our main result.

2.1. Problem Definition 

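A message $m$ from the set
$\mathcal{M} = \{1,\dots,\lfloor \exp(nR) \rfloor\}$ is
transmitted using a length-$n$ block code defined as a pair
$(\varphi,\phi)$ of mappings, where
$\varphi : \mathcal{M} \times \Theta \mapsto \mathscr{X}^n$ is the
encoder (that utilizes $\hat\theta_T$), and
$\phi : \mathscr{Y}^n \times \Theta \mapsto \mathcal{M} \cup \{0\}$
is the decoder (that utilizes $\hat\theta_R$). The random rate,
which depends on the unknown channel realization $\theta$ through
its probability of error, is given by
$n^{-1} \log M_{\theta,\hat\theta}$. The maximum (over all
messages) error probability is

$e_{\max}(\varphi,\phi,\hat\theta;\theta) = \max_{m \in \mathcal{M}} \sum_{\mathbf{y} \in \mathscr{Y}^n :\, \phi(\mathbf{y},\hat\theta_R) \neq m} W^n(\mathbf{y}|\varphi(m,\hat\theta_T),\theta)$.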
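The maximum error probability can be evaluated by brute force for very small codes. The following sketch assumes a toy setting that is not from the paper: a binary symmetric channel with crossover probability $\theta$, a length-3 repetition code, and a majority decoder that ignores the channel estimates. All names are hypothetical.

```python
from itertools import product


def bsc(y, x, theta):
    """Toy channel W(y|x, theta): BSC with crossover probability theta."""
    return theta if y != x else 1.0 - theta


def block_likelihood(y_seq, x_seq, theta):
    """W^n(y|x, theta) as a product over the block."""
    p = 1.0
    for y, x in zip(y_seq, x_seq):
        p *= bsc(y, x, theta)
    return p


def max_error_probability(codebook, decoder, theta):
    """e_max = max_m sum over {y : decoder(y) != m} of W^n(y | x_m, theta)."""
    n = len(next(iter(codebook.values())))
    worst = 0.0
    for m, x_m in codebook.items():
        err = sum(block_likelihood(y, x_m, theta)
                  for y in product((0, 1), repeat=n)
                  if decoder(y) != m)
        worst = max(worst, err)
    return worst


def majority(y):
    """Hypothetical decoder: message 2 if most received bits are 1, else message 1."""
    return 2 if sum(y) >= 2 else 1


codebook = {1: (0, 0, 0), 2: (1, 1, 1)}
```

With $\theta = 0.1$ the result is $3\theta^2(1-\theta) + \theta^3 = 0.028$. In the paper's setting the encoder and decoder would additionally depend on $\hat\theta_T$ and $\hat\theta_R$.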



For a given channel estimate
$\hat\theta = (\hat\theta_R,\hat\theta_T)$, and
$0 < \epsilon, \gamma < 1$, an outage rate $R \ge 0$ is
$(\epsilon,\gamma)$-achievable on an unknown channel
$W(\cdot|x,\theta) \in \mathcal{W}_\Theta$, if for every
$\delta > 0$ and every sufficiently large $n$ there exists a
sequence of length-$n$ block codes such that the rate satisfies

$\Pr(\{\theta \in \Lambda_\epsilon : n^{-1} \log M_{\theta,\hat\theta} \ge R - \delta\} \,|\, \hat\theta) \ge 1 - \gamma$,

where
$\Lambda_\epsilon = \{\theta \in \Theta : e_{\max}(\varphi,\phi,\hat\theta;\theta) \le \epsilon\}$
is the set of all channel states allowing for reliable decoding.
This definition requires that maximum error probabilities larger
than $\epsilon$ occur with probability less than $\gamma$, i.e.,
$\Pr(\Lambda_\epsilon|\hat\theta) \ge 1 - \gamma$. The practical
advantage of this definition is that, for any degree of channel
estimation accuracy, the transmitter and receiver strive to
construct codes that ensure reliable communication with
probability $1 - \gamma$, no matter which unknown state $\theta$
arises during the transmission.

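A rate $R \ge 0$ is $\gamma$-achievable if it is
$(\epsilon,\gamma)$-achievable for every $0 < \epsilon < 1$. Let
$C_\epsilon(\gamma,\psi_{\theta|\hat\theta},\hat\theta)$ be the
largest $(\epsilon,\gamma)$-achievable rate for an outage
probability $\gamma$ and a given estimate $\hat\theta$. The
estimation-induced outage capacity of this channel is then defined
as the largest $\gamma$-achievable rate, i.e.,

$C(\gamma,\psi_{\theta|\hat\theta},\hat\theta) = \lim_{\epsilon \downarrow 0} C_\epsilon(\gamma,\psi_{\theta|\hat\theta},\hat\theta)$.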



2.2. Coding Theorem 



We next state a coding theorem quantifying the estimation-induced
outage capacity $C(\gamma,\psi_{\theta|\hat\theta},\hat\theta)$
for our scenario where an estimate $\hat\theta_R$ of the channel
state is known at the decoder and a noisy version $\hat\theta_T$
of $\hat\theta_R$ is known at the encoder. We impose an input
constraint that depends on the transmitter CSI and requires that
$\Gamma(P) = \sum_{x \in \mathscr{X}} \Gamma(x) P(x)$ is less than
$\mathcal{P}(\hat\theta_T)$. Here, $\Gamma(\cdot)$ is an arbitrary
non-negative function, and $P(\cdot)$ denotes the input
distribution.

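Theorem 2.1 Given $0 < \gamma < 1$, the estimation-induced outage
capacity of an unknown DMC
$W(\cdot|x,\theta) \in \mathcal{W}_\Theta$ is given by

$C(\gamma,\psi_{\theta|\hat\theta},\hat\theta) = \max_{P :\, \Gamma(P) \le \mathcal{P}(\hat\theta_T)} \mathcal{C}(\gamma,\psi_{\theta|\hat\theta},\hat\theta,P)$,   (2)

where

$\mathcal{C}(\gamma,\psi_{\theta|\hat\theta},\hat\theta,P) = \sup_{\Lambda \subset \Theta :\, \Pr(\Lambda|\hat\theta) \ge 1-\gamma} \; \inf_{\theta \in \Lambda} I(P,W(\cdot|\cdot,\theta))$.   (3)

In addition,
$C_\epsilon(\gamma,\psi_{\theta|\hat\theta},\hat\theta) = C(\gamma,\psi_{\theta|\hat\theta},\hat\theta)$
for all $0 < \epsilon < 1$.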

In this theorem, we used the mutual information

$I(P,W(\cdot|\cdot,\theta)) = \sum_{x \in \mathscr{X}} \sum_{y \in \mathscr{Y}} P(x) W(y|x,\theta) \log \frac{W(y|x,\theta)}{Q(y|\theta)}$

with $Q(y|\theta) = \sum_{x \in \mathscr{X}} P(x) W(y|x,\theta)$.
We emphasize that the supremum in (3) is taken over all subsets
$\Lambda$ of $\Theta$ that have (conditional) probability at least
$1 - \gamma$. Furthermore, codes achieving capacity (2) can be
viewed as codes for a simultaneous channel
$\mathcal{W}_{\Lambda^*}$, which has been determined by the
decoder. Hence, this outage capacity
$C(\gamma,\psi_{\theta|\hat\theta},\hat\theta)$ is seen to equal
the maximum capacity of all compound channels that are contained
in $\mathcal{W}_\Theta$ and, conditioned on $\hat\theta$, have
sufficiently high probability. The significance of Theorem 2.1 is
that it provides an explicit way to evaluate the outage capacity
for an unknown but estimated channel for arbitrary estimation
accuracies without additional assumptions.
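For a finite set of candidate states, the sup-inf over high-probability subsets reduces to a simple ranking: keep the states with the largest mutual information until their conditional probability mass reaches $1 - \gamma$, and report the smallest mutual information kept. The sketch below assumes a toy family of binary symmetric channels with a uniform input (which is optimal for every state of this toy family); the function names and the numerical posterior are hypothetical.

```python
from math import log2


def h2(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)


def bsc_mutual_information(theta):
    """I(P, W(.|., theta)) for a BSC with uniform input: 1 - h2(theta)."""
    return 1.0 - h2(theta)


def outage_rate(states, posterior, gamma, mi=bsc_mutual_information):
    """Sup over sets with conditional probability >= 1 - gamma of the worst-case MI."""
    ranked = sorted(zip(states, posterior), key=lambda sp: mi(sp[0]), reverse=True)
    mass = 0.0
    for theta, p in ranked:
        mass += p
        # Small tolerance guards against floating-point rounding of the mass.
        if mass >= 1.0 - gamma - 1e-12:
            return mi(theta)
    raise ValueError("posterior must sum to one")


states = [0.01, 0.05, 0.2, 0.5]          # hypothetical crossover probabilities
posterior = [0.5, 0.3, 0.15, 0.05]       # hypothetical Pr(theta | estimate)
```

Allowing a larger outage probability lets the set exclude more of the bad states, so the returned rate is non-decreasing in $\gamma$.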

Observe that if perfect CSIR is available then
$\hat\theta_R = \theta$ and the instantaneous mutual information
is attainable. Thus, every rate $R$ can be associated to the set
$\Lambda_R = \{\theta \in \Theta : I(P,W(\cdot|\cdot,\theta)) \ge R - \delta\}$
whose probability is $1 - \gamma$. Therefore, in this case, the
channel can be modeled as a compound channel, whose transition
probability depends on a random parameter $\theta \in \Theta$. In
the following section we provide a proof of Theorem 2.1.

3. Proof of the Coding Theorem 

In this section we determine the capacity by using the tools of
information theory, according to the definition in Section 2. The
proof of Theorem 2.1 is based on an extension of the maximal code
lemma [5] to bound the minimum size of the images for the
considered channels, according to the notion of estimation-induced
outage capacity.

Throughout this section, we will use the notion of (conditional)
information-typical (I-typical) sets defined in terms of
(Kullback-Leibler) divergence, i.e.,
$T_P^n(\delta) = \{\mathbf{x} : D(P_n \| P) \le \delta\}$ and
$T_W^n(\mathbf{x},\delta) = \{\mathbf{y} : D(W_n \| W | P_n) \le \delta\}$
where $P_n$ is the empirical PM associated with $\mathbf{x}$ and
$W_n$ is the empirical conditional PM associated with $\mathbf{x}$
and $\mathbf{y}$.

3.1. Generalized Maximal Code Lemma 

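Consider the set of all common $\eta$-images
$\mathscr{B}^n \subset \mathscr{Y}^n$ associated to a set
$\mathscr{A}^n \subset \mathscr{X}^n$ via the collection of
simultaneous DMCs $\mathcal{W}_\Lambda$, i.e., all
$\mathscr{B}^n$ with
$\inf_{\theta \in \Lambda} W^n(\mathscr{B}^n|\mathbf{x},\theta) \ge \eta$
for all $\mathbf{x} \in \mathscr{A}^n$. In the following, we will
denote by

$g_\Lambda(\mathscr{A}^n,\eta) = \min \|\mathscr{B}^n\|$   (4)

the minimum of the cardinalities of all common $\eta$-images
$\mathscr{B}^n$. For a given channel estimate
$\hat\theta = (\hat\theta_R,\hat\theta_T)$ with degraded CSIT
$\theta \leftrightarrow \hat\theta_R \leftrightarrow \hat\theta_T$,
a code
$(\mathbf{x}_1(\hat\theta_T),\dots,\mathbf{x}_M(\hat\theta_T); \mathscr{D}_1^n(\hat\theta_R),\dots,\mathscr{D}_M^n(\hat\theta_R))$
according to the above definition consists of a set of codewords
$\mathbf{x}_m(\hat\theta_T)$ and associated decoding sets
$\mathscr{D}_m^n(\hat\theta_R)$ (i.e., the decoder reads
$\phi(\mathbf{y},\hat\theta_R) = m$ iff
$\mathbf{y} \in \mathscr{D}_m^n(\hat\theta_R)$). For any set
$\mathscr{A}^n \subset \mathscr{X}^n$, we call a code admissible
if $\mathbf{x}_m(\hat\theta_T) \in \mathscr{A}^n$, all decoding
sets $\mathscr{D}_m^n(\hat\theta_R) \subset \mathscr{Y}^n$ are
mutually disjoint, and the set

$\Lambda_\epsilon = \{\theta \in \Theta : \max_{m \in \mathcal{M}} W^n((\mathscr{D}_m^n(\hat\theta_R))^c \,|\, \mathbf{x}_m(\hat\theta_T),\theta) \le \epsilon\}$,   (5)

satisfies $\Pr(\Lambda_\epsilon|\hat\theta) \ge 1 - \gamma$. Any
input distribution satisfying the input constraint
$\mathcal{P}(\hat\theta_T)$ is denoted by $P(\cdot|\hat\theta_T)$.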

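Theorem 3.1 Let two arbitrary numbers $0 < \epsilon, \delta < 1$
be given. There exists a positive integer $n_0$ such that for all
$n \ge n_0$ the following two statements hold.

1) Direct Part: For any
$\mathscr{A}^n \subset T^n_{P|\hat\theta_T}(\delta,\hat\theta_T)$
and any random set $\Lambda \subset \Theta$ with
$\Pr(\Lambda|\hat\theta) \ge 1 - \gamma$, there exists an
admissible sequence of length-$n$ block codes of size

$M_{\theta,\hat\theta} \ge \exp[-n(H(\mathcal{W}_\Lambda|P) + \delta)] \, g_\Lambda(\mathscr{A}^n,\epsilon - \delta)$,   (6)

for all $\theta \in \Lambda$, where $\Lambda_\epsilon = \Lambda$.

2) Converse Part: For
$\mathscr{A}^n = T^n_{P|\hat\theta_T}(\delta,\hat\theta_T)$, the
size of any admissible sequence of length-$n$ block codes is
bounded as

$M_{\theta,\hat\theta} \le \exp[-n(H(\mathcal{W}_{\Lambda_\epsilon}|P) - \delta)] \, g_{\Lambda_\epsilon}(\mathscr{A}^n,\epsilon + \delta)$,   (7)

for all $\theta \in \Lambda_\epsilon$.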

The proof of this theorem follows from basic properties of
I-typical sequences and the concept of robust I-typical sets; it
is given in Appendix A. Theorem 2.1 is then obtained through the
following corollary.

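Corollary 3.2 Consider a given channel estimate $\hat\theta$, an
outage probability $\gamma$, numbers $0 < \epsilon, \delta < 1$
and any PM $P(\cdot|\hat\theta_T) \in \mathscr{P}(\mathscr{X})$.
Let $\mathcal{C}(\gamma,\psi_{\theta|\hat\theta},\hat\theta,P)$ be
defined by expression (3). Then the following statements hold:

(i) There exists an optimal sequence of block codes of length $n$
and size $M_{\theta,\hat\theta}$, whose maximum error
probabilities larger than $\epsilon$ occur with probability less
than $\gamma$, such that

$\Pr(n^{-1} \log M_{\theta,\hat\theta} \ge R - 2\delta \,|\, \hat\theta) \ge 1 - \gamma$   (8)

for all rates
$R \le \mathcal{C}(\gamma,\psi_{\theta|\hat\theta},\hat\theta,P)$,
provided that $n \ge n_0$.

(ii) For any block codes of length $n$, size
$M_{\theta,\hat\theta}$ and codewords in
$T^n_{P|\hat\theta_T}(\delta,\hat\theta_T)$, whose maximum error
probabilities larger than $\epsilon$ occur with probability less
than $\gamma$, the largest code size satisfies

$\Pr(n^{-1} \log M_{\theta,\hat\theta} > R + 2\delta \,|\, \hat\theta) < \gamma$   (9)

for all rates
$R \ge \mathcal{C}(\gamma,\psi_{\theta|\hat\theta},\hat\theta,P)$,
whenever $n \ge n_0'$.

Proof: From the direct part of Theorem 3.1 and Lemma A.1 we have
that there exist admissible codes such that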



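$n^{-1} \log M_{\theta,\hat\theta} \ge n^{-1} \log g_\Lambda(\mathscr{A}^n,\epsilon - \delta) - H(\mathcal{W}_\Lambda|P) - \delta$,   (10)

for all $\theta \in \Lambda$ and sets $\Lambda \subset \Theta$
(having probability at least $1 - \gamma$). Let $\mathscr{B}^n$ be
the common $(\epsilon - \delta)$-image of minimal size
$\|\mathscr{B}^n\| = g_\Lambda(\mathscr{A}^n,\epsilon - \delta)$.
Then it is easy to show that the probability of $\mathscr{B}^n$
under the induced output distribution is bounded from below,
uniformly over $\theta \in \Lambda$, by a positive quantity that
depends only on $\epsilon - \delta$. By applying Corollary 1.2.14
in [5] to this relation and substituting it in (10), we obtain for
all $n \ge n_0'(|\mathscr{X}|,|\mathscr{Y}|,\epsilon,\delta)$,

$n^{-1} \log M_{\theta,\hat\theta} \ge \inf_{\theta \in \Lambda} I(P,W(\cdot|\cdot,\theta)) - 2\delta$,   (11)

for all $\theta \in \Lambda$, where the last inequality follows
from the concavity of the entropy function with respect to
$\mathcal{W}_\Theta$. Finally, taking the supremum in (11) with
respect to all sets $\Lambda \subset \Theta$ having probability at
least $1 - \gamma$ yields the lower bound (8):

$n^{-1} \log M_{\theta,\hat\theta} \ge \mathcal{C}(\gamma,\psi_{\theta|\hat\theta},\hat\theta,P) - 2\delta \ge R - 2\delta$,   (12)

for all rates
$R \le \mathcal{C}(\gamma,\psi_{\theta|\hat\theta},\hat\theta,P)$
and $\theta \in \Lambda^*$, which is attained by some code with
$\Lambda_\epsilon = \Lambda^*$. Next we prove the upper bound (9).
From the converse part of Theorem 3.1 we have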

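$n^{-1} \log M_{\theta,\hat\theta} \le n^{-1} \log g_{\Lambda_\epsilon}(\mathscr{A}^n,\epsilon + \delta) - H(\mathcal{W}_{\Lambda_\epsilon}|P) + \delta$,   (13)

for all $\theta \in \Lambda_\epsilon$. Since
$\mathscr{A}^n = T^n_{P|\hat\theta_T}(\delta,\hat\theta_T)$
implies that any common $(\epsilon + \delta)$-image of
$\mathscr{A}^n$ will be included in
$\bigcap_{\theta \in \Lambda_\epsilon} T^n_{W_\theta P}(\delta_n')$,
Lemma 1.2.12 in [5] ensures that there exists
$n \ge n_0''(|\mathscr{X}|,|\mathscr{Y}|,\epsilon,\delta)$ such that

$n^{-1} \log g_{\Lambda_\epsilon}(\mathscr{A}^n,\epsilon + \delta) \le \inf_{\theta \in \Lambda_\epsilon} H(W_\theta P) + \delta$.   (14)

Then by applying (14) to (13), and then taking the supremum with
respect to all sets $\Lambda \subset \Theta$ having probability at
least $1 - \gamma$, we obtain

$n^{-1} \log M_{\theta,\hat\theta} \le \mathcal{C}(\gamma,\psi_{\theta|\hat\theta},\hat\theta,P) + 2\delta \le R + 2\delta$,   (15)

for all
$R \ge \mathcal{C}(\gamma,\psi_{\theta|\hat\theta},\hat\theta,P)$
and $\theta \in \Lambda_\epsilon$ with
$\Pr(\theta \notin \Lambda_\epsilon|\hat\theta) < \gamma$, and
this concludes the proof.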

4. Numerical Results and Discussion 

In this section, we illustrate our results via a realistic
single-user mobile wireless communication system involving a
Ricean block flat fading channel, where the channel state is
described by a single fading coefficient. The channel states in
each block are i.i.d. and unknown at the transmitter and the
receiver. The transmission extends over many blocks (coherence
intervals) such that the average outage capacity (1) is indeed the
appropriate performance criterion. The practical significance of
this capacity stems from QoS requirements present in many
communication services.

Within each block, the actual codeword (data) is preceded by a
length-$N$ training sequence
$\mathbf{x}_T = (x_0,\dots,x_{N-1})$ of power $P_T$ which is known
by the receiver. This enables maximum likelihood (ML) estimation
of the fading coefficient $\theta$ at the receiver, yielding the
estimate $\hat\theta_R$. In many wireless systems, CSI at the
transmitter $\hat\theta_T$ has to be provided by the receiver via
a feedback channel. This allows the transmitter to perform power
control $\mathcal{P}(\hat\theta_T)$, i.e., allocate more transmit
power when the estimated channel is good, and less or no power
when the channel is bad. Below, we consider the following three
feedback schemes: (i) no feedback, i.e., absence of CSIT; (ii) an
instantaneous and unlimited feedback/CSIT
($\hat\theta_T = \hat\theta_R$); (iii) an instantaneous and
rate-limited feedback/CSIT; here the CSI is quantized using a
quantization codebook which is known at the transmitter and the
receiver (we construct this codebook using the well-known
Lloyd-Max algorithm).
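The codebook construction for rate-limited feedback can be sketched with an empirical Lloyd-Max iteration. This is a minimal illustration assuming a real-valued scalar quantity (for instance the magnitude of the channel estimate) and a set of training samples; the paper does not specify which quantity is quantized, and the names `lloyd_max` and `quantize` are hypothetical.

```python
def lloyd_max(samples, levels, iterations=50):
    """Design a scalar quantizer from training samples (empirical Lloyd-Max)."""
    data = sorted(samples)
    # Initialise the reproduction points at evenly spaced sample quantiles.
    points = [data[(2 * k + 1) * len(data) // (2 * levels)] for k in range(levels)]
    for _ in range(iterations):
        # Nearest-neighbour condition: thresholds are midpoints between points.
        thresholds = [(a + b) / 2 for a, b in zip(points, points[1:])]
        cells = [[] for _ in range(levels)]
        for s in data:
            cells[sum(s > t for t in thresholds)].append(s)
        # Centroid condition: each point moves to the mean of its cell
        # (an empty cell keeps its previous point).
        points = [sum(c) / len(c) if c else p for c, p in zip(cells, points)]
    return points


def quantize(value, points):
    """Index of the nearest reproduction point, i.e. the fed-back symbol."""
    return min(range(len(points)), key=lambda k: abs(value - points[k]))
```

Two bits of feedback correspond to `levels=4`; the transmitter then only learns the index returned by `quantize`.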

4.1. Channel Model 

The channel model within a block is given by (all quantities are
complex-valued) $Y[i] = H[i]\,X[i] + Z[i]$, where $X[i]$ and
$Y[i]$ are the discrete-time transmit and receive signal,
respectively, $H[i]$ is the fading coefficient, and
$Z[i] \sim \mathcal{CN}(0,\sigma_Z^2)$ is i.i.d. zero-mean,
circularly complex Gaussian noise. The transmit signal is subject
to the average power constraint
$E\{|X[i]|^2\} \le \mathcal{P}(\hat\theta_T)$ with
$E_{\hat\theta_T}\{\mathcal{P}(\hat\theta_T)\} \le \bar{P}$.

The optimum power allocation $\mathcal{P}(\hat\theta_T)$ is
obtained using Lagrange multipliers and the Kuhn-Tucker theorem.
The channel state $\theta = H[i]$ is assumed to be circularly
complex Gaussian
$\theta \sim \psi(\theta) = \mathcal{CN}(\mu_h,\sigma_h^2)$. The
Rice factor is defined as $K_h = |\mu_h|^2/\sigma_h^2$. The ML
estimate $\hat\theta_R = \hat{H}[i]$ is obtained by correlating
the received signal with the known training sequence
$\mathbf{x}_T$. Its performance can be characterized via the pdf
$\psi(\theta|\hat\theta_R) = \mathcal{CN}(\rho\hat\theta_R + (1-\rho)\mu_h,\, \rho\sigma_E^2)$,
where $\rho = \sigma_h^2/(\sigma_h^2 + \sigma_E^2)$ and
$\sigma_E^2 = \sigma_Z^2/(N P_T)$.
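The posterior parameters of the channel given its estimate follow from standard Gaussian conditioning. The sketch below assumes the estimate equals the true coefficient plus an independent circular Gaussian error of variance $\sigma_Z^2/(N P_T)$, and that variances denote the total variance of the complex variable; function and argument names are hypothetical.

```python
def estimation_error_variance(noise_var, n_train, train_power):
    """Error variance of the ML estimate from N training symbols of power P_T."""
    return noise_var / (n_train * train_power)


def posterior_of_channel(theta_hat, mu_h, var_h, var_e):
    """Weight rho, mean and variance of theta given theta_hat = theta + error."""
    rho = var_h / (var_h + var_e)
    mean = rho * theta_hat + (1.0 - rho) * mu_h
    variance = rho * var_e
    return rho, mean, variance
```

As the training length grows, the error variance shrinks, $\rho$ tends to one, and the posterior concentrates on the estimate itself.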

For a given estimate $\hat\theta_0$, evaluating the capacity
expression of Theorem 2.1 requires solving an optimization problem
where we have to determine the optimum set $\Lambda^*$, and the
associated channel state $\theta^* \in \Lambda^*$ minimizing
mutual information. The estimation-induced outage capacity can
then be shown to be given by
$C(\gamma,\psi_{\theta|\hat\theta},\hat\theta_0) = \log_2\left(1 + \mathcal{P}(\hat\theta_T)\,(r^*)^2/\sigma_Z^2\right)$,
where $r^*$ is the $\gamma$-percentile¹ of
$\psi(r|\hat\theta = \hat\theta_0)$ with $r = |\theta|$ (for
further details see [6]).
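The percentile-based capacity expression can be approximated by simulation. The following sketch assumes that, given the estimate, the channel coefficient is circularly complex Gaussian with known mean and total variance, and it replaces the exact percentile by an empirical quantile of simulated magnitudes; the function names, the sample count and the fixed seed are hypothetical choices.

```python
import random
from math import log2, sqrt


def sample_posterior_magnitudes(mean, variance, count, rng):
    """Draw r = |theta| for theta circular complex Gaussian with given mean/variance."""
    s = sqrt(variance / 2.0)  # standard deviation per real dimension
    return [abs(complex(rng.gauss(mean.real, s), rng.gauss(mean.imag, s)))
            for _ in range(count)]


def outage_capacity(mean, variance, gamma, power, noise_var, count=20000, seed=0):
    """log2(1 + power * r_star^2 / noise_var), r_star = empirical gamma-quantile of |theta|."""
    rng = random.Random(seed)
    r = sorted(sample_posterior_magnitudes(mean, variance, count, rng))
    r_star = r[int(gamma * count)]
    return log2(1.0 + power * r_star ** 2 / noise_var)
```

Averaging `outage_capacity` over many simulated estimates would give a Monte Carlo approximation of the mean outage capacity.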

4.2. Results and Discussion 

Fig. 1 shows the average estimation-induced outage capacity (cf.
(1)) in bits per channel use for outage probability
$\gamma = 0.01$ versus the signal-to-noise ratio
$\mathrm{SNR} = |\mu_h|^2 \bar{P}/\sigma_Z^2$ for different
amounts of training and for unlimited and absent feedback/CSIT
(all numerical results were obtained using Monte Carlo
simulations). For comparison, we show ergodic capacity under
perfect CSI. The channel's Rice factor was $K_h = 0$ dB. It is
seen that the average rate increases with the amount of CSIR and
CSIT. To achieve 1.5 bit per channel use without feedback/CSIT, a
scheme with estimated CSIR and $N = 3$ (▽ markers) requires 5 dB,
i.e., 4.3 dB more than with perfect CSIR (solid line). If the
training length is further reduced to $N = 1$ (○ markers), this
gap increases to 6.4 dB. In the case of unlimited feedback
(CSIT = CSIR), the SNR requirements for 1.5 bit per channel use
are -1.3 dB (perfect CSIR, dashed line), 2.1 dB (estimated CSIR
with $N = 3$, * markers), and 3.7 dB (estimated CSIR with
$N = 1$, × markers), respectively. Thus, with unlimited feedback
the gap between estimated and perfect CSI is slightly smaller than
without feedback (3.4 dB and 5 dB with $N = 3$ and $N = 1$,
respectively). Observe that for SNR values larger than 10 dB, a
system without feedback/CSIT and $N = 3$ achieves performance
similar to that of a system with unlimited feedback and $N = 1$.
Using this information, a system designer may therefore decide to
use training sequences of length $N = 3$ instead of implementing a
feedback channel.

¹ The percentile $r^*$ can be computed by using the cumulative
distribution of a non-central chi-square with two degrees of
freedom.

Fig. 2 shows the average estimation-induced outage capacity for an
outage probability $\gamma = 0.01$ and rate-limited feedback/CSIT
versus the SNR. We suppose two bits of feedback ($R_{FB} = 2$),
and training sequences of length $N = 3$. Observe that at 1.5 bits
the gap between the average outage capacity without feedback and
with rate-limited feedback is 1 dB for two bits of feedback/CSIT,
whereas the gap with respect to the average outage capacity with
unlimited feedback is only 2 dB.

5. Conclusions 

In this paper we have studied the problem of reliable 
communications over unknown DMCs when the receiver 
and transmitter only know an estimate of the channel state. 
We proposed to characterize the information theoretic limits 
of such scenarios in terms of the novel notion of estimation- 
induced outage capacity. We provided an explicit expres- 
sion for the maximum achievable outage rate in the context 
of an associated coding theorem and its strong converse. We used a
Ricean fading channel and maximum likelihood channel estimation to
illustrate our approach by computing its mean outage capacity. Our
results are useful to assess the amount of training data and
feedback required to achieve a target rate satisfying a quality of
service constraint. It would be of interest to study coding
schemes achieving this capacity, because they would allow the
design of communication systems with QoS constraints and imperfect
channel estimation.
Figure 1: Average estimation-induced outage capacity for
different amounts of training vs. SNR.

Figure 2: Average estimation-induced outage capacity with
two bits of rate-limited feedback ($R_{FB} = 2$) vs. SNR.

A. Auxiliary results

This appendix introduces a few concepts and provides some
auxiliary technical results required for the proof of Theorem 2.1.

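Robust Decoders: Let $\mathscr{A}^n \subset \mathscr{X}^n$ denote
a set of transmit sequences and let
$W_\theta(\cdot|x) = W(\cdot|x,\theta)$. A set
$\mathscr{D}_\Lambda^n \subset \mathscr{Y}^n$ (depending on
$\Lambda \subset \Theta$) is called a robust $\epsilon$-decoding
set for a sequence $\mathbf{x} \in \mathscr{A}^n$ and an unknown
DMC $W(\cdot|x,\theta) \in \mathcal{W}_\Theta$, if
$\Pr(W^n(\mathscr{D}_\Lambda^n|\mathbf{x},\theta) > 1 - \epsilon \,|\, \hat\theta) \ge 1 - \gamma$.

A set $\mathscr{B}^n \subset \mathscr{Y}^n$ of receive sequences
is called a common $\eta$-image ($0 < \eta \le 1$) of a transmit
set $\mathscr{A}^n \subset \mathscr{X}^n$ for the collection of
DMCs $\mathcal{W}_\Lambda$, iff
$\inf_{\theta \in \Lambda} W^n(\mathscr{B}^n|\mathbf{x},\theta) \ge \eta$
for all $\mathbf{x} \in \mathscr{A}^n$. Finally,
$\Lambda \subset \Theta$ is called a confidence set for $\theta$
given $\hat\theta$, if
$\Pr(\theta \notin \Lambda|\hat\theta) < \gamma$, where $\gamma$
is the outage probability.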

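Robust I-Typical Sets: A robust I-typical set is defined as
$\mathscr{B}_\Lambda^n(\mathbf{x},\delta_n) = \bigcup_{W_\theta \in \mathcal{W}_\Lambda} T^n_{W_\theta}(\mathbf{x},\delta_n)$,
with arbitrary $\Lambda \subset \Theta$ and $\delta$-sequence
$\{\delta_n\}$ (cf. [5]).

Lemma A.1 For any $0 < \gamma, \epsilon < 1$, a necessary and
sufficient condition for a robust I-typical set
$\mathscr{B}_\Lambda^n(\mathbf{x},\delta_n)$ to be a robust
$\epsilon$-decoding set with probability $1 - \gamma$ is that
$\Lambda$ be a confidence set.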



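Theorem A.2 For any collection of DMCs $\mathcal{W}_\Lambda$ and
associated robust I-typical set
$\mathscr{B}_\Lambda^n(\mathbf{x},\delta_n)$ with
$\mathbf{x} \in T_P^n(\delta_n)$, there exists an index $n_0$ such
that for all $n \ge n_0$ the size
$\|\mathscr{B}_\Lambda^n(\mathbf{x},\delta_n)\|$ of the robust
I-typical set is bounded as

$\left| n^{-1} \log \|\mathscr{B}_\Lambda^n(\mathbf{x},\delta_n)\| - H(\mathcal{W}_\Lambda|P) \right| \le \eta_n$.

Here,
$H(\mathcal{W}_\Lambda|P) = \sup_{V \in \mathcal{W}_\Lambda} H(V|P)$
and $\eta_n \to 0$ as $\delta_n \to 0$ and $n \to \infty$.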

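Proof: We first show that the size of
$\mathscr{B}_\Lambda^n(\mathbf{x},\delta_n)$ is asymptotically
equal to the size of
$\mathscr{B}_S^n(\mathbf{x}) = \bigcup_{V \in S} T_V^n(\mathbf{x})$
where $S = \mathcal{W}_\Lambda \cap \mathscr{P}_n(\mathscr{Y})$ is
the intersection of $\mathcal{W}_\Lambda$ with the set
$\mathscr{P}_n(\mathscr{Y})$ of empirical distributions induced by
receive sequences of length $n$. In particular, there exists an
index $n_0$ such that for all $n \ge n_0$ and
$\mathbf{x} \in \mathscr{X}^n$

$\|\mathscr{B}_S^n(\mathbf{x})\| \le \|\mathscr{B}_\Lambda^n(\mathbf{x},\delta_n)\| \le (1+n)^{|\mathscr{X}||\mathscr{Y}|} \|\mathscr{B}_S^n(\mathbf{x})\|$.   (16)

The lower bound in (16) is trivial. We will next establish that
there exists $\epsilon_n > 0$ such that for all $n \ge n_0$

$\bigcup_{W \in \mathcal{W}_\Lambda} T_W^n(\mathbf{x},\delta_n) \subseteq \bigcup_{V \in S} T_V^n(\mathbf{x},\epsilon_n)$,   (17)

from which the upper bound in (16) follows via basic properties of
types (cf. [5]).

Assume that $\mathcal{W}_\Lambda$ is a relatively $\tau_0$-open
subset of $\mathcal{W}_\Lambda \cup \mathscr{P}_n(\mathscr{Y})$,
i.e., every $W \in \mathcal{W}_\Lambda$ has a
$\tau_0$-neighborhood defined in the $\tau_0$-topology [7]. Then
there exists $n_0$ such that for any $n \ge n_0$ and
$\epsilon > 0$, the $\epsilon$-open ball $U_0(W,\epsilon)$
satisfies
$U_0(W,\epsilon) \cap \mathscr{P}_n(\mathscr{Y}) \subset \mathcal{W}_\Lambda$.
Choose $0 < \epsilon' < \epsilon$ and pick an empirical
conditional PM $V \in \mathscr{P}_n(\mathscr{Y})$ such that for
all $(a,b) \in \mathscr{X} \times \mathscr{Y}$,
$|V(b|a) - W(b|a)| \le \epsilon'$ and $V(b|a) = 0$ if
$W(b|a) = 0$. The continuity properties of information divergences
imply that for any sequence
$\mathbf{y} \in T_W^n(\mathbf{x},\delta_n)$ (i.e.,
$D(W_n \| W | P_n) \le \delta_n$),
$|W_n(b|a)P_n(a) - W(b|a)P_n(a)| \le \sqrt{\delta_n/2}$ and hence
$|W_n(b|a)P_n(a) - V(b|a)P_n(a)| \le \epsilon' + \sqrt{\delta_n/2}$.
Finally, from this inequality it is easy to show that there exists
an $\epsilon_n > 0$ such that $D(W_n \| V | P_n) \le \epsilon_n$,
i.e., $\mathbf{y} \in T_V^n(\mathbf{x},\epsilon_n)$. Consequently,
for any $W \in \mathcal{W}_\Lambda$ and large enough $n$, it is
possible to find $V \in S$ and $\epsilon_n > 0$ such that
$T_W^n(\mathbf{x},\delta_n) \subseteq T_V^n(\mathbf{x},\epsilon_n)$,
thus establishing (17). Using similar arguments as above and the
uniform continuity of the entropy function, it can be shown that
there exists $n_0'$ such that for all $n \ge n_0'$ and
$\mathbf{x} \in \mathscr{X}^n$

$\left| n^{-1} \log \|\mathscr{B}_S^n(\mathbf{x})\| - \sup_{V \in \mathcal{W}_\Lambda} H(V|P) \right| \le \epsilon_n$,   (18)

with
$\epsilon_n = |\mathscr{X}||\mathscr{Y}| n^{-1} \log(n+1) + \epsilon_n'$
and $\epsilon_n' \to 0$ as $n \to \infty$. The theorem follows by
combining the inequalities (16) and (18) and setting
$\eta_n = \epsilon_n + |\mathscr{X}||\mathscr{Y}| n^{-1} \log(n+1)$.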
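The polynomial factor in the bounds above comes from counting types. As a small numerical illustration (not part of the proof), the sketch below compares the exact number of types of length-$n$ sequences with the bound $(n+1)^k$ for an alphabet of size $k$, and evaluates the resulting rate penalty, which vanishes as $n$ grows. The function names are hypothetical.

```python
from math import comb, log


def number_of_types(n, alphabet_size):
    """Exact number of empirical distributions of length-n sequences."""
    return comb(n + alphabet_size - 1, alphabet_size - 1)


def type_count_bound(n, alphabet_size):
    """Polynomial upper bound (n + 1)^alphabet_size on the number of types."""
    return (n + 1) ** alphabet_size


def rate_penalty(n, x_size, y_size):
    """Per-symbol penalty |X||Y| log(n + 1) / n (in nats) caused by the type count."""
    return x_size * y_size * log(n + 1) / n
```

For binary input and output alphabets the penalty is already below 0.01 nats per symbol at block lengths of a few thousand.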

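Proof of Theorem 3.1: To prove the direct part, consider an
admissible code that is maximal, i.e., it cannot be extended by
an arbitrary pair $(\mathbf{x}_{M+1}; \mathscr{D}_{M+1}^n)$ such
that the extended code remains admissible. Define the set
$\mathscr{B}^n = \bigcup_{m} \mathscr{D}_m^n$ with
$\mathscr{D}_m^n \subset \mathscr{B}_\Lambda^n(\mathbf{x}_m,\delta)$,
and choose $\delta < \epsilon$ such that
$1 - \epsilon > \epsilon - \delta$. Then

$\inf_{\theta \in \Lambda} W^n(\mathscr{B}^n|\mathbf{x}_i,\theta) > \epsilon - \delta$,   (19)

for all $\mathbf{x}_i \in \{\mathbf{x}_1,\dots,\mathbf{x}_M\}$. As
the code is maximal, for all
$\mathbf{x} \in \mathscr{A}^n \setminus \{\mathbf{x}_1,\dots,\mathbf{x}_M\}$,
we have
$\inf_{\theta \in \Lambda} W^n(\mathscr{B}_\Lambda^n(\mathbf{x},\delta) \setminus \mathscr{B}^n \,|\, \mathbf{x},\theta) < 1 - \epsilon$.
This inequality implies that for all $\theta \in \Lambda$ and
large enough $n$

$W^n(\mathscr{B}^n|\mathbf{x},\theta) > \epsilon - \delta$,   (20)

for all
$\mathbf{x} \in \mathscr{A}^n \setminus \{\mathbf{x}_1,\dots,\mathbf{x}_M\}$.
The inequalities (19) and (20) together imply that
$\mathscr{B}^n$ is a common $(\epsilon - \delta)$-image of the set
$\mathscr{A}^n$ via the collection of channels
$\mathcal{W}_\Lambda$. By the definition of
$g_\Lambda(\mathscr{A}^n,\epsilon - \delta)$ it follows that

$\|\mathscr{B}^n\| \ge g_\Lambda(\mathscr{A}^n,\epsilon - \delta)$.   (21)

On the other hand,
$\mathscr{D}_m^n \subset \mathscr{B}_\Lambda^n(\mathbf{x}_m,\delta)$
implies that

$\|\mathscr{B}^n\| \le M_{\theta,\hat\theta} \exp[n(H(\mathcal{W}_\Lambda|P) + \delta)]$,   (22)

for $n$ large enough and all $\theta \in \Lambda$, where the last
inequality follows by applying the cardinality upper bound of
Theorem A.2. The lower bound (6) is then immediately obtained by
combining (21) and (22). To prove the converse part, let
$\mathscr{B}^n$ be a common $(\epsilon + \delta)$-image via the
collection of channels $\mathcal{W}_{\Lambda_\epsilon}$, i.e.,

$\inf_{\theta \in \Lambda_\epsilon} W^n(\mathscr{B}^n|\mathbf{x}_m,\theta) \ge \epsilon + \delta$ for $m \in \mathcal{M}$,   (23)

that achieves the minimum in (4), i.e.,
$\|\mathscr{B}^n\| = g_{\Lambda_\epsilon}(\mathscr{A}^n,\epsilon + \delta)$.
For any admissible code, (5) and (23) imply
$\inf_{\theta \in \Lambda_\epsilon} W^n(\mathscr{B}^n \cap \mathscr{D}_m^n \,|\, \mathbf{x}_m,\theta) \ge \delta$
for $m \in \mathcal{M}$. Using Corollary 1.2.14 in [5], we hence
obtain

$\|\mathscr{B}^n \cap \mathscr{D}_m^n\| \ge \exp[n(H(\mathcal{W}_{\Lambda_\epsilon}|P) - \delta)]$,   (24)

for $n$ large enough. On the other hand, the decoding sets are
disjoint and thus

$g_{\Lambda_\epsilon}(\mathscr{A}^n,\epsilon + \delta) = \|\mathscr{B}^n\| \ge \sum_{m \in \mathcal{M}} \|\mathscr{B}^n \cap \mathscr{D}_m^n\| \ge M_{\theta,\hat\theta} \exp[n(H(\mathcal{W}_{\Lambda_\epsilon}|P) - \delta)]$,

where the last inequality follows from (24). This inequality is
equivalent to (7) and concludes the proof of the theorem.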